AITopics | Learning Management

Learning Management

Optimal Gap-Dependent Regret for Private Stochastic Decision-Theoretic Online Learning

arXiv.org Machine Learning, May-29-2026

We study stochastic decision-theoretic online learning with full information under event-level pure differential privacy. A COLT open problem of Hu and Mehta asks for the optimal gap-dependent regret rate in this setting. For $K$ actions, losses in $[0,1]$, and a unique best action separated from the second-best action by a gap $\Delta_{\min}$, the known lower bound is of order $\frac{\log K}{\min\{\Delta_{\min},\varepsilon\}}$, or equivalently, up to universal constants, of order \[ \frac{\log K}{\Delta_{\min}}+\frac{\log K}{\varepsilon}. \] We give a horizon-free pure-DP algorithm and prove the explicit regret bound \[ \operatorname{Reg}_T \le 1000 \cdot \left(\frac{\log K}{\Delta_{\min}}+\frac{\log K}{\varepsilon}\right) \] for every horizon $T$. The numerical constant is not optimized. The algorithm partitions time into blocks of exponentially increasing size, plays a single action throughout each block, and chooses the next action by an exponential mechanism applied to a data-independent random prefix of the previous block. The random prefix converts block regret into a sum, over all prefix lengths, of softmax selection errors. A single entropy-potential argument controls all privacy-dominated large-gap actions at cost $\log K/\varepsilon$.
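The block structure described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the doubling block lengths, the uniform prefix-length distribution, the sensitivity of 1, and the names `exponential_mechanism`, `run_blocked_learner`, and `loss_fn` are all assumptions made for illustration, and the sketch carries no formal privacy or regret guarantee.

```python
import math
import random


def exponential_mechanism(cum_losses, epsilon, rng):
    """Sample an index with probability proportional to exp(-epsilon * loss / 2).

    Assumption of this sketch: per-round losses lie in [0, 1], so changing one
    round moves each cumulative loss by at most 1 (sensitivity 1).
    """
    lo = min(cum_losses)  # shifting by the minimum leaves the distribution unchanged
    weights = [math.exp(-epsilon * (c - lo) / 2.0) for c in cum_losses]
    return rng.choices(range(len(cum_losses)), weights=weights, k=1)[0]


def run_blocked_learner(loss_fn, num_actions, horizon, epsilon, seed=0):
    """Play one action per block; block lengths double (1, 2, 4, ...).

    loss_fn(t) is a hypothetical callback returning the full loss vector of
    round t. Returns the list of actions played, one per round.
    """
    rng = random.Random(seed)
    action = rng.randrange(num_actions)  # arbitrary first action
    played = []
    t, block_len = 0, 1
    while t < horizon:
        length = min(block_len, horizon - t)
        # Prefix length is drawn before any loss of this block is seen
        # (uniform here, which is an assumption of the sketch).
        prefix_len = rng.randint(1, length)
        cum = [0.0] * num_actions
        for s in range(length):
            losses = loss_fn(t + s)
            played.append(action)  # the same action for the whole block
            if s < prefix_len:
                for a in range(num_actions):
                    cum[a] += losses[a]
        # Next block's action depends only on the prefix of this block.
        action = exponential_mechanism(cum, epsilon, rng)
        t += length
        block_len *= 2
    return played
```

In this sketch each round's loss vector feeds at most one call to the mechanism, which is the property the blocked design is meant to provide.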

artificial intelligence, machine learning, privacy, (15 more...)

arXiv.org Machine Learning

2605.29148

Country: Europe > United Kingdom (0.28)

Genre: Research Report (0.64)

Industry: Education > Educational Setting > Online (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.92)

Online Learning on Hidden-Convex Losses via Algorithmic Equivalence: Optimal Regret, Geometric Barrier, and Bandit Feedback

Barakat, Anas, Kontogiannis, Andreas, Pollatos, Vasilis, Panageas, Ioannis, Varvitsiotis, Antonios

arXiv.org Machine Learning, May-27-2026

We study adversarial online learning with hidden-convex losses, i.e., nonconvex losses that become convex after a nonlinear reparameterization. Ghai, Lu and Hazan (2022) proved that, under geometric and smoothness assumptions, online gradient descent (OGD) on such nonconvex losses approximately simulates online mirror descent (OMD) on the underlying convex losses with a suitable regularizer, yielding $\mathcal{O}(T^{2/3})$ regret. They left open whether the optimal $\Theta(\sqrt{T})$ regret from online convex optimization can be recovered in this hidden-convex setting. We answer this question affirmatively. More specifically, via a sharper discrete-time algorithmic equivalence argument, we prove that OGD achieves $\mathcal{O}(\sqrt{T})$ regret under the same assumptions, matching the optimal worst-case rate for adversarial online convex optimization. We also address another open question of Ghai, Lu and Hazan (2022) by clarifying the geometry required for this algorithmic equivalence. We replace the diagonal-Jacobian sufficient condition with a necessary-and-sufficient Hessian compatibility condition, thereby expanding the class of admissible reparameterizations. We complement our tight regret bound with a lower bound showing that the Hessian compatibility assumption is essential for OGD; when it fails, we construct a smooth reparameterization and an adversarial sequence of hidden-convex losses for which OGD suffers $\Omega(T)$ regret. Finally, we extend our analysis to one-point bandit feedback and prove an $\mathcal{O}(T^{3/4})$ expected regret bound for bandit OGD with spherical smoothing, matching its classical rate on convex losses.
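As a rough illustration of the hidden-convex setting, the sketch below runs projected OGD on a one-dimensional loss that is nonconvex in the parameter $x$ but convex in $u = e^x$. The loss $(e^x - y)^2$, the step-size schedule, the projection interval, and all function names are assumptions chosen for this example; they are not taken from the paper, and the sketch does not check the paper's Hessian compatibility condition.

```python
import math


def hidden_convex_loss(x, y):
    """Loss (exp(x) - y)^2: nonconvex in x, convex in u = exp(x)."""
    return (math.exp(x) - y) ** 2


def loss_gradient(x, y):
    """Derivative of hidden_convex_loss with respect to x."""
    u = math.exp(x)
    return 2.0 * (u - y) * u


def online_gradient_descent(targets, x0=0.0, base_step=0.1, bound=2.0):
    """Projected OGD on x with step size base_step / sqrt(t).

    Returns the iterate played in each round and the cumulative loss.
    """
    x = x0
    iterates, total_loss = [], 0.0
    for t, y in enumerate(targets, start=1):
        iterates.append(x)
        total_loss += hidden_convex_loss(x, y)
        x -= (base_step / math.sqrt(t)) * loss_gradient(x, y)
        x = max(-bound, min(bound, x))  # projection onto [-bound, bound]
    return iterates, total_loss
```

With a constant target such as `targets = [2.0] * 2000`, the iterates settle near $\log 2$, the minimizer in the original parameterization, even though the loss is not convex in $x$ on the whole interval.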

artificial intelligence, machine learning, sequence, (16 more...)

arXiv.org Machine Learning

2605.26373

Country:

Europe (0.28)
North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Education > Educational Setting > Online (0.61)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.61)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Online Learning-to-Defer with Varying Experts

Duy, Dang Hoang, Montreuil, Yannis, Meyer, Maxime, Carlier, Axel, Ng, Lai Xing, Ooi, Wei Tsang

arXiv.org Machine Learning, May-21-2026

Learning-to-Defer (L2D) methods route each query either to a predictive model or to external experts. While existing work studies this problem in batch settings, real-world deployments require handling streaming data, changing expert availability, and shifting expert distribution. We introduce the first online L2D algorithm for multiclass classification with bandit feedback and a dynamically varying pool of experts. Our method achieves regret guarantees of $O((n+n_e)T^{2/3})$ in general and $O((n+n_e)\sqrt{T})$ under a low-noise condition, where $T$ is the time horizon, $n$ is the number of labels, and $n_e$ is the number of distinct experts observed across rounds. The analysis builds on novel $\mathcal{H}$-consistency bounds for the online framework, combined with first-order methods for online convex optimization. Experiments on synthetic and real-world datasets demonstrate that our approach effectively extends standard Learning-to-Defer to settings with varying expert availability and reliability.

artificial intelligence, def, machine learning, (15 more...)

arXiv.org Machine Learning

2605.1234

Country: Asia (0.28)

Genre: Research Report (0.64)

Industry:

Health & Medicine (0.67)
Education > Educational Setting > Online (0.65)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.41)

EDU Unlimited turns online learning into a one-time $20 purchase instead of ongoing tuition costs

PCWorld, May-15-2026, 08:00:00 GMT

TL;DR: Score lifetime access to EDU Unlimited for just $19.97 through May 31 (MSRP $600) and unlock 1,000+ online courses across tech, business, creative skills, and more with a single payment. Online learning can get expensive fast, especially when a single course or boot camp can run into the hundreds or even thousands of dollars. EDU Unlimited by StackSkills flips that model by giving you one-time lifetime access to a massive library of 1,000+ courses across a wide range of subjects for just $19.97 during this limited-time offer (MSRP $600). From coding and marketing to creative hobbies like photography or design, StackSkills lets you build your dream skill set at your own pace, without the pressure.

artificial intelligence, buyer, consumer ai performance privacy productivity, (11 more...)

PCWorld

Industry:

Information Technology (1.00)
Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.36)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Artificial Intelligence (1.00)
Information Technology > Hardware (0.93)

Sample-Mean Anchored Thompson Sampling for Offline-to-Online Learning with Distribution Shift

Li, Bochao, Fu, Yao, Chen, Wei, Kong, Fang

arXiv.org Machine Learning, May-15-2026

Offline-to-online learning aims to improve online decision-making by leveraging offline logged data. A central challenge in this setting is the distribution shift between offline and online environments. While some existing works attempt to leverage shifted offline data, they largely rely on UCB-type algorithms. Thompson sampling (TS) represents another canonical class of bandit algorithms, well known for its strong empirical performance and naturally suited to offline-to-online learning through its Bayesian formulation. However, unlike UCB indices, posterior samples in TS are not guaranteed to be optimistic with respect to the true arm means. This makes indices constructed from purely online and hybrid data difficult to compare and complicates their use. To address this issue, we propose sample-mean anchored TS (Anchor-TS), which introduces a novel median-based anchoring rule that defines the arm index as the median of an online posterior sample, a hybrid posterior sample, and the online sample mean. The median anchoring systematically corrects bias induced by distribution shift by mitigating over-estimation for suboptimal arms and under-estimation for optimal arms, while exploiting offline information to obtain more accurate estimates when the shift is small. We establish theoretical guarantees showing that the proposed algorithm safely leverages offline data to accelerate online learning, and quantifying how the degree of distribution shift and the size of offline data affect the resulting regret reduction. Extensive experiments demonstrate consistent improvements of our algorithm over baselines.
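The median-based anchoring rule described in this abstract can be sketched as follows. This is only an illustration: the Gaussian posteriors with unit-variance rewards and a flat prior, the pooling of online and offline rewards into a single "hybrid" sample, and the names `anchored_index` and `select_arm` are assumptions, not details confirmed by the abstract.

```python
import random
import statistics


def anchored_index(online_rewards, offline_rewards, rng):
    """Median of an online posterior sample, a hybrid posterior sample,
    and the online sample mean.

    Assumes the arm has at least one online observation, and assumes
    Gaussian posteriors with unit-variance rewards and a flat prior.
    """
    n_on = len(online_rewards)
    online_mean = sum(online_rewards) / n_on
    hybrid = list(online_rewards) + list(offline_rewards)
    hybrid_mean = sum(hybrid) / len(hybrid)
    online_sample = rng.gauss(online_mean, 1.0 / math_sqrt(n_on))
    hybrid_sample = rng.gauss(hybrid_mean, 1.0 / math_sqrt(len(hybrid)))
    return statistics.median([online_sample, hybrid_sample, online_mean])


def math_sqrt(n):
    """Square root helper kept local so the sketch only needs two imports."""
    return n ** 0.5


def select_arm(online_data, offline_data, rng):
    """Pick the arm with the largest anchored index."""
    indices = [anchored_index(on, off, rng) for on, off in zip(online_data, offline_data)]
    return max(range(len(indices)), key=indices.__getitem__)
```

Under these assumptions, heavily shifted offline data pushes the hybrid sample far from the online estimates, and the median then falls back to the larger or smaller of the two online quantities, so the index stays anchored to online information.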

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Machine Learning

2605.10289

Genre:

Research Report (0.64)
Instructional Material (0.46)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Data Science > Data Mining > Big Data (0.67)

MasterClass is 50% off today. It's worth it just for the entertainment

PCWorld, May-7-2026, 14:02:11 GMT

MasterClass is 50% off today. Until May 10th, MasterClass annual plans start at $60/year. It's great for casual learners who want high-quality, entertaining courses from big names. With the job market being what it is, there's never been a better time to learn new skills (or brush up on old ones).

artificial intelligence, buyer, consumer ai performance privacy productivity, (10 more...)

PCWorld

Genre: Instructional Material (0.73)

Industry:

Information Technology > Security & Privacy (0.75)
Leisure & Entertainment > Games > Computer Games (0.56)
Education > Educational Setting > Online (0.49)
Education > Educational Technology > Educational Software > Computer Based Training (0.30)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Hardware (0.90)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.30)

The Bernstein-von Mises theorem for Bayesian one-pass online learning

Lee, Jeyong, Choi, Junhyeok, Kim, Dongguen, Chae, Minwoo

arXiv.org Machine Learning, May-1-2026

Bayesian online learning provides a coherent framework for sequential inference. However, its theoretical understanding remains limited, particularly in the one-pass setting. Existing theoretical guarantees typically require the mini-batch sample size to diverge, a condition that fails in the one-pass regime. In this paper, we propose a new Bayesian online learning algorithm tailored to the one-pass setting, which incorporates a warm-start phase to ensure stable sequential updates. For this algorithm, we show that the sequentially updated posterior attains the optimal convergence rate. Building on this, we establish an online analogue of the Bernstein-von Mises theorem, which guarantees valid uncertainty quantification without diverging mini-batch sample sizes. Our analysis is based on a novel theoretical framework that differs fundamentally from existing approaches in the online learning literature. Numerical experiments on generalized linear models show that the proposed method matches the performance of the batch estimator while outperforming existing online procedures.

artificial intelligence, inequality hold, machine learning, (18 more...)

arXiv.org Machine Learning

2604.27442

Genre: Research Report (0.83)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)

Learning on the Edge: Online Learning with Stochastic Feedback Graphs

Neural Information Processing Systems, Apr-28-2026, 03:59:45 GMT

The framework of feedback graphs is a generalization of sequential decision-making with bandit or full information feedback. In this work, we study an extension where the directed feedback graph is stochastic, following a distribution similar to the classical Erdős-Rényi model. Specifically, in each round every edge in the graph is either realized or not, with a distinct probability for each edge.
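A minimal sketch of the per-round edge sampling described here, assuming edges are realized independently and that an edge (i, j) means "playing action i reveals the loss of action j". The matrix layout `edge_probs` and both function names are hypothetical and serve only to make the setting concrete.

```python
import random


def sample_feedback_graph(edge_probs, rng):
    """Return the set of directed edges (i, j) realized in one round.

    edge_probs[i][j] is the probability that edge (i, j) is realized;
    edges are sampled independently (an assumption of this sketch).
    """
    n = len(edge_probs)
    return {
        (i, j)
        for i in range(n)
        for j in range(n)
        if rng.random() < edge_probs[i][j]
    }


def observed_actions(played, realized_edges):
    """Actions whose losses are revealed when `played` is chosen."""
    return {j for (i, j) in realized_edges if i == played}
```

Setting every probability to 1 recovers full-information feedback, while probability 1 on the self-loops and 0 elsewhere recovers bandit feedback.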

artificial intelligence, graph, machine learning, (19 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Industry: Education > Educational Setting > Online (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.41)

Detecting and Adapting to Irregular Distribution Shifts in Bayesian Online Learning

Neural Information Processing Systems, Apr-25-2026, 11:19:19 GMT

We consider the problem of online learning in the presence of distribution shifts that occur at an unknown rate and of unknown intensity. We derive a new Bayesian online inference approach to simultaneously infer these distribution shifts and adapt the model to the detected changes by integrating ideas from change point detection, switching dynamical systems, and Bayesian online learning. Using a binary "change variable," we construct an informative prior such that, if a change is detected, the model partially erases the information of past model updates by tempering to facilitate adaptation to the new data distribution. Furthermore, the approach uses beam search to track multiple change-point hypotheses and selects the most probable one in hindsight. Our proposed method is model-agnostic, applicable in both supervised and unsupervised learning settings, suitable for an environment of concept drifts or covariate drifts, and yields improvements over state-of-the-art Bayesian online learning approaches.

artificial intelligence, bayesian inference, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Genre: Research Report (0.46)

Industry:

Education > Educational Setting > Online (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
(2 more...)

Oracle-Efficient Online Learning for Smoothed Adversaries

Neural Information Processing Systems, Apr-24-2026, 21:46:41 GMT

We study the design of computationally efficient online learning algorithms under smoothed analysis. In this setting, at every step an adversary generates a sample from an adaptively chosen distribution whose density is upper bounded by $1/\sigma$ times the uniform density. Given access to an offline optimization (ERM) oracle, we give the first computationally efficient online algorithms whose sublinear regret depends only on the pseudo/VC dimension $d$ of the class and the smoothness parameter $\sigma$.

algorithm, artificial intelligence, machine learning, (14 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report (0.46)

Industry: Education > Educational Setting > Online (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.66)